\section{Problem Definition}
\label{sec:def}

%\subsection{Notation} 
\noindent{\bf Notation.}
Let ${\cal X} = \{ 1, \ldots,n \}$ denote a set of $n$ individuals and ${\cal A} =$ \{$a_1, \ldots, a_m$\} denote a set of $m$ skills. Each individual $i$ is associated with a set of skills $X_i \subseteq {\cal A}$. If $a_j \in X_i$, then an individual $i$ has skill $a_j$. For each skill $a$, we define its support set, $S(a)$, as the set of individuals in $\cal X$ with skill $a$. That is, $S (a) = \{ i |  i \in {\cal X}$ and $a \in X_i \}$. A task $\cal T$ is a set of pairs where each pair, $<a_j$,$k_j>$, specifies that at least $k_j$ individuals of skill $a_j$ are required to perform the task.  

Let $G({\cal X}, E)$ denote the undirected, weighted graph representing the social network associated with the set of individuals ${\cal X}$. 
%The weight associated with each edge signifies the collaborative compatibility between the nodes they connect. 
We use the notations $E(G)$ and $V(G)$ to represent the edge set and vertex set associated with the graph $G$. If ${\cal X'} \subseteq V(G)$, we use $G[\cal X']$ to denote the subgraph of $G$ induced by the nodes in $\cal X'$. Further, $W(\cal X')$ denotes the sum of the edge-weights associated with all the edges in the subgraph induced by the nodes in $\cal X'$. We also define a distance function between any two node $i, i'$ in a graph $G$ as the sum of the edge-weights along the shortest path between $i$ and $i'$ in $G$. Further, without loss of generality, we assume that the graph $G$ is connected; we can transform every disconnected subgraph to a connected one by simply adding an edge that denotes zero collaborative compatibility. Given a measure of collaborative compatibility $Cc()$, we now formalize the problems in this paper.

%Problem $1$ [{\it Single Skill Team Formation (sTF)}]  \\
\noindent{\bf Single Skill Team Formation (sTF).} Given a set of $n$ individuals ${\cal X} = \{ 1, \ldots, n \}$, a graph $G({\cal X}, E)$, task ${\cal T} = \{<a, k>\}$, find $\cal X' \subseteq X$, such that $|{\cal X}'  \cap S(a)| \ge k$, and the collaborative compatibility $Cc(\cal X')$ is optimized. 

%Problem $2$ [{\it Multiple Skill Team Formation (mTF)}]  \\
\noindent{\bf Multiple Skill Team Formation (mTF).} Given a set of $n$ individuals ${\cal X} = \{ 1,\ldots, n \}$, a graph $G({\cal X}, E)$, task ${\cal T} =$ \{$<a_1, k_1>, <a_2, k_2>, \ldots, <a_m, k_m>$\}, find $ \cal X' \subseteq X$, such that $|{\cal X}'  \cap S(a_j)| \ge k_j$  for each $j \in \{ 1, \ldots, m \}$ and the collaborative compatibility $Cc(\cal X')$ is optimized.

%We define the following two metrics for collaborative compatibility mentioned in the problems {\it sTF} and {\it mTF}.\\ 

The main metric that we consider for collaborative compatibility for {\it sTF} and {\it mTF} is the following density based objective. In addition to this, we consider a diameter based objective as well (suggested in~\cite{LLT}) for comparison.

\noindent{\bf Maximum Density(D).} Given a graph $G({\cal X}, E)$ and a set of individuals $\cal X' \subseteq X$, we define the density collaborative compatibility of $\cal X'$, denoted by {\it Cc-D}$(\cal X')$ to be the density of the induced  subgraph $G[\cal X']$. Recall that the density $d(G)$ of a graph $G$ is defined as $d(G) = \frac{W(G)}{|V(G)|}$ . The higher the value of the density, the better is the collaborative compatibility. An optimal solution $\cal X' \subseteq X$, is the team that can perform task $\cal T$ and has maximum density. 

\noindent{\bf Minimum Diameter(R).} Given a graph $G({\cal X}, E)$ and a set of individuals $\cal X' \subseteq X$, we define the diameter collaborative compatibility of $\cal X'$, denoted by {\it Cc-R}$(\cal X')$, to be the diameter of the subgraph $G[\cal X']$. The diameter of a graph is the largest shortest path between any two nodes in the graph. 
%The lower the value of the diameter, the better is the collaborative compatibility. 
An optimal solution $\cal X' \subseteq X$, is the team that can perform task $\cal T$ and has minimum diameter. 

In the following sections, we refer to the {\it Single Skill Team Formation (sTF)} and {\it Multiple Skill Team Formation (mTF)} problems with collaborative compatibility {\it Cc-R} as {\it Diameter-sTF} and {\it Diameter-mTF}, respectively.  Similarly, for the collaborative compatibility {\it Cc-D} we refer to the corresponding problems as {\it Density-sTF} and {\it Density-mTF} respectively.

%\subsection{Properties}
\noindent{\bf Properties.} We now describe some properties of the maximum density objective. Notice that neither of these properties hold on {\it Diameter-sTF} or {\it Diameter-mTF}.
% (or a minimum spanning tree based objective suggested in~\cite{LLT}). 
%These properties are not satisfied by the minimum diameter objective.
For brevity, we mention the intuition without a rigorous definition or proof.

\noindent{\bf Strict Monotonicity.} If a communication edge (with positive weight) is added between two nodes in the solution set for the {\it Density-sTF} or {\it Density-mTF} problem, then the collaborative compatibility objective {\it Cc-D} for the solution necessarily increases. Similarly, if a communication edge already present is deleted, then the {\it Cc-D} objective value decreases. 
This seems intuitive as an added collaboration between two people in the team enhances the quality of the team. However, in the case of diameter, adding or deleting an edge may not affect the solution at all. 

\noindent{\bf Sensitivity.} The {\it Cc-D} value for {\it Density-sTF} or {\it Density-mTF} does not increase or decrease radically upon adding or deleting an edge. Specifically, it can only change to an extent depending on the weight of the added or deleted edge, compared to the total weight of edges in the solution. However, adding or deleting an edge can radically change the diameter (for example make it finite from infinite) for an induced subgraph; this implies that the diameter objective is highly sensitive to small change. 

%\noindent{\bf Topology independence.} The diameter-based objective may sometimes tend towards certain unfavorable topologies, e.g., a star (or star-like) graph, as such a solution would have a small diameter. However this solution may not necessarily imply a cohesive team because the center node of the star network may become a bottleneck. 
%and if it is removed the quality might degrade considerably. 
%The {\it Cc-D} value for {\it Density-sTF} or {\it Density-mTF} is not biased towards star-like topologies. However, it may be possible that density-based objectives also get biased to certain other undesirable topologies. 
%(DOES THIS POINT MAKE SENSE? NEED BETTER REPHRASING)

The properties for density based objectives fall out of the fact that adding or deleting edges only gradually alters the density of a solution subgraph. Diameter based objectives (or even the minimum spanning tree based objective suggested in~\cite{LLT} that we do not consider in this paper) are not smooth in this sense; altering the graph slightly can change the objective radically. These properties make density based objectives somewhat more suitable. One drawback, however, of density as an objective arises from the fact that the optimal solution may contain disconnected components. Notice that this is not the case for the diameter based objective, however, although the solution returned is connected it may be of large size including non-skilled (undesired) nodes that are required to ensure the connectivity. To ensure the connectivity property for the density-based solutions, in the experimental section we suggest several heuristic algorithms.
Eventually, the quality of teams produced by different definitions
% (as a set of skilled nodes as well as a potential for collaborative compatibility) 
needs to be evaluated (potential for collaboration) based on the measures neutral to these definitions; %we make such objective comparisons as well in the experimental section.
we present experimental results of such objective comparisons.